In theoretical computer science and formal language theory, a regular expression (abbreviated regex or regexp, and sometimes called a rational expression) is a sequence of characters that defines a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations. The concept arose in the 1950s, when the American mathematician Stephen Kleene formalized the description of a ''regular language'', and came into common use with the Unix text processing utilities ed, an editor, and grep (global regular expression print), a filter.

Regular expressions are so useful in computing that the various systems for specifying them have evolved to provide both a ''basic'' and an ''extended'' standard for the grammar and syntax; ''modern'' regular expressions heavily augment the standard. Regular expression processors are found in several search engines, in the search and replace dialogs of several word processors and text editors, and in the command lines of utilities such as sed and AWK.

Many programming languages provide regular expression capabilities, some built-in, for example Perl, JavaScript, Ruby, AWK, and Tcl, and others via a standard library, for example .NET languages, Java, Python, POSIX C and C++ (since C++11). Most other languages offer regular expressions via a library.

==Patterns==
Each character in a regular expression is understood to be either a metacharacter (with its special meaning) or a regular character (with its literal meaning). Together, they can be used to identify textual material of a given pattern, or to process a number of instances of it. Pattern matches can vary from a precise equality to a very general similarity (controlled by the metacharacters). The metacharacter syntax is designed specifically to represent prescribed targets in a concise and flexible way, to direct the automation of text processing of general text files, specific textual forms, or arbitrary input strings.
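The difference between a metacharacter and a regular character can be illustrated with a minimal Python sketch using the standard `re` module. The sample text and the two patterns are hypothetical, chosen only to show how the same character behaves with and without its special meaning.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "price: 3.50, code: 3x50"

# An unescaped "." is a metacharacter: it matches any single character
# except a newline, so both "3.50" and "3x50" are found.
meta = re.findall(r"3.50", text)

# "\." escapes the dot, so it is treated as a literal period
# and only "3.50" is found.
literal = re.findall(r"3\.50", text)

print(meta)     # ['3.50', '3x50']
print(literal)  # ['3.50']
```

The only change between the two patterns is the backslash, which switches the dot from its special meaning to its literal one.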
A very simple use of a regular expression is to locate the same word spelled two different ways in a text editor: a single regular expression can match both "serialise" and "serialize". Wildcards could also achieve this, but are more limited in what they can express (having fewer metacharacters and a simpler language base). Wildcard characters are typically used for globbing similar names in a list of files, whereas regular expressions are usually employed in applications that pattern-match text strings in general. For example, a regexp can match excess whitespace at the beginning or end of a line, and a more advanced regexp can match any numeral. See ''Examples'' for more examples.

A regular expression processor translates a regular expression into a nondeterministic finite automaton (NFA), which is then made deterministic and run on the target text string to recognize substrings that match the regular expression. The picture shows the NFA scheme obtained from a regex built from ''s'', where ''s'' denotes a simpler regex that has already been recursively translated to the NFA ''N''(''s'').

Source: Wikipedia, the free encyclopedia.
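The three uses mentioned in this section (two spellings of one word, excess whitespace at the edges of a line, and numerals) can be sketched in Python. The patterns below are assumptions written for illustration, not the expressions from the original article, and the sample strings are hypothetical.

```python
import re

# Illustrative patterns (assumptions, not taken from the article text).
# A character class accepts either "s" or "z" in the same position.
SPELLING = re.compile(r"seriali[sz]e")

# Spaces or tabs anchored at the start (^) or at the end ($) of the string.
EDGE_WHITESPACE = re.compile(r"^[ \t]+|[ \t]+$")

# Optional sign, then digits with an optional fraction (or a bare fraction),
# then an optional exponent. Non-capturing groups keep findall() simple.
NUMERAL = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")

spellings = SPELLING.findall("We serialise here and serialize there.")
trimmed = EDGE_WHITESPACE.sub("", "  \tpadded line \t ")
numerals = NUMERAL.findall("values: 42, -3.14, .5, 6.02e23")

print(spellings)      # ['serialise', 'serialize']
print(repr(trimmed))  # 'padded line'
print(numerals)       # ['42', '-3.14', '.5', '6.02e23']
```

The whitespace pattern leaves the single space inside "padded line" untouched, because that run of whitespace is neither at the start nor at the end of the string.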
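The recursive translation of a regex into an NFA can be sketched in Python in the style of Thompson's construction. This is a minimal sketch under stated assumptions, not how any particular regex processor is implemented: the class `NFA` and the helpers `literal`, `concat`, `alt`, `star`, `closure` and `matches` are hypothetical names introduced here. Instead of building a deterministic automaton up front, the sketch tracks the set of active NFA states while reading the input, which amounts to determinizing lazily, and it tests whole strings rather than searching for substrings.

```python
from itertools import count

_ids = count()  # source of fresh state numbers


class NFA:
    """An NFA fragment with one start state and one accept state."""

    def __init__(self):
        self.start = next(_ids)
        self.accept = next(_ids)
        self.edges = {}  # state -> list of (symbol, target); None = epsilon

    def add(self, src, symbol, dst):
        self.edges.setdefault(src, []).append((symbol, dst))

    def absorb(self, other):
        # Copy the edges of an already translated sub-fragment N(s).
        # Each fragment must be used only once, since states are not renamed.
        for src, out in other.edges.items():
            self.edges.setdefault(src, []).extend(out)


def literal(ch):
    n = NFA()
    n.add(n.start, ch, n.accept)
    return n


def concat(a, b):
    n = NFA()
    n.absorb(a)
    n.absorb(b)
    n.add(n.start, None, a.start)
    n.add(a.accept, None, b.start)
    n.add(b.accept, None, n.accept)
    return n


def alt(a, b):
    n = NFA()
    for part in (a, b):
        n.absorb(part)
        n.add(n.start, None, part.start)
        n.add(part.accept, None, n.accept)
    return n


def star(a):
    n = NFA()
    n.absorb(a)
    n.add(n.start, None, a.start)    # enter the loop
    n.add(n.start, None, n.accept)   # or skip it (zero repetitions)
    n.add(a.accept, None, a.start)   # repeat
    n.add(a.accept, None, n.accept)  # or leave
    return n


def closure(nfa, states):
    """All states reachable from `states` through epsilon edges only."""
    stack, seen = list(states), set(states)
    while stack:
        state = stack.pop()
        for symbol, dst in nfa.edges.get(state, []):
            if symbol is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)


def matches(nfa, text):
    """True if the whole of `text` is accepted by `nfa`."""
    current = closure(nfa, {nfa.start})
    for ch in text:
        moved = {dst for state in current
                 for symbol, dst in nfa.edges.get(state, [])
                 if symbol == ch}
        current = closure(nfa, moved)
    return nfa.accept in current


# The regex (a|b)*c, assembled bottom-up from simpler fragments.
pattern = concat(star(alt(literal("a"), literal("b"))), literal("c"))

print(matches(pattern, "abbac"))  # True
print(matches(pattern, "c"))      # True
print(matches(pattern, "abca"))   # False
```

Each combinator wraps fragments that were built earlier, which mirrors the idea that a simpler regex ''s'' is translated to ''N''(''s'') first and then reused inside the larger automaton.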